Project description

We are studying the effect of H151, an inhibitor of the cGAS-STING signaling pathway, on Jurkat, a T-ALL model cell line.

The biological experiment revealed that H151 causes cell death, which suggests that the pathway is important for the survival of T-ALL cells.

We want to identify the genes that are differentially expressed between the control condition and the H151-treated condition. To do so, we will perform a differential expression analysis (DEA) followed by a pathway enrichment analysis (PEA).

Following a basic RNA-seq pipeline analysis

Importing data

Raw counts

Metadata

Experimental data

Data exploration

First, let’s view the distribution of the different gene biotypes in our data:

In the downstream analysis (DEA), we’ll focus on the two most abundant biotypes (protein_coding and lncRNA). Additional filtering will be applied:

  • MaxCount_threshold = 20 (at least 1 sample must have a read count above this value)
  • CpmCount_threshold = 0.5 (counts-per-million (CPM) threshold)
  • MinSample = 3 (minimum number of samples that must pass the CPM threshold)
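
The three thresholds above can be sketched in base R as follows. This is only an illustration on a toy count matrix with made-up values; the report does not show the actual filtering code, so the way the two rules are combined (both must hold) is an assumption, and the toy library sizes are far smaller than real ones.

```r
# Toy raw-count matrix: genes in rows, samples in columns (hypothetical values).
counts <- matrix(
  c(  0,   1,   0,   2,   0,   1,    # almost silent gene
     30,  25,  40,  35,  28,  33,    # moderately expressed gene
     25,   0,   0,   0,   0,   0,    # expressed in a single sample only
    500, 450, 520, 480, 510, 495),   # highly expressed gene
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("S", 1:6))
)

MaxCount_threshold <- 20
CpmCount_threshold <- 0.5
MinSample <- 3

# Counts per million: scale each sample by its library size.
lib_size <- colSums(counts)
cpm <- sweep(counts, 2, lib_size, "/") * 1e6

# Rule 1: at least one sample with a raw count above MaxCount_threshold.
pass_max <- apply(counts, 1, max) > MaxCount_threshold
# Rule 2: at least MinSample samples with a CPM above CpmCount_threshold.
pass_cpm <- rowSums(cpm > CpmCount_threshold) >= MinSample

# Assumption: a gene is kept only if both rules hold.
keep <- pass_max & pass_cpm
filtered <- counts[keep, , drop = FALSE]
```

In this toy example, gene1 fails the raw-count rule and gene3 fails the CPM rule, so only gene2 and gene4 remain in `filtered`.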



Now, to understand the global gene expression landscape and to check the quality of our data, we need to perform a dimensionality reduction analysis: Principal Component Analysis (PCA)

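PCA is driven by variance, so it works best when:

  • The variance is roughly independent of the mean (homoscedasticity); in raw counts the variance grows with the mean, so a few highly expressed genes would dominate
  • The values are roughly normally distributed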

So we need to perform some data transformation first on raw counts.
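
To illustrate why the transformation matters, here is a small base R sketch on made-up counts. It uses `log2(x + 1)` as a simple stand-in for the VST (it is not the DESeq2 transformation), and the two genes and their values are hypothetical.

```r
# Hypothetical counts: 2 control samples (C1, C2) and 2 treated samples (T1, T2).
counts <- rbind(
  geneHigh = c(10000, 10500, 9800, 10200),  # highly expressed, no condition effect
  geneLow  = c(10, 12, 40, 44)              # lowly expressed, ~4-fold change
)
colnames(counts) <- c("C1", "C2", "T1", "T2")

# prcomp() expects samples in rows and genes in columns.
pca_raw <- prcomp(t(counts))
pca_log <- prcomp(t(log2(counts + 1)))  # log2(x + 1): crude stand-in for the VST

# Contribution (loading) of each gene to the first principal component.
loading_raw <- abs(pca_raw$rotation[, "PC1"])
loading_log <- abs(pca_log$rotation[, "PC1"])
```

On the raw counts, PC1 is almost entirely driven by `geneHigh`, whose large variance is just noise at a high expression level. After the log transformation, PC1 is driven by `geneLow`, the gene that actually separates the two conditions.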

We’ll use the Variance Stabilizing Transformation (VST) from the DESeq2 package. This will:

  • Normalize for library size (DESeq2 computes size factors using a median-of-ratios method)
  • Use a negative binomial model to estimate dispersion
  • Apply a variance-stabilizing mathematical transformation
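
The first step, the median-of-ratios size factors, can be sketched in base R as below. This is a simplified re-implementation for illustration on toy counts, not the DESeq2 code itself; the function name `size_factors_mor` is hypothetical. In practice the whole transformation would typically be obtained with a single DESeq2 call such as `vst(dds, blind = TRUE)`.

```r
# Simplified median-of-ratios size factors (illustrative, not the DESeq2 source).
size_factors_mor <- function(counts) {
  # Use only genes with non-zero counts in every sample (log(0) is undefined).
  ok <- apply(counts, 1, function(x) all(x > 0))
  log_counts <- log(counts[ok, , drop = FALSE])
  # Per-gene log geometric mean across samples: a pseudo-reference sample.
  log_geo_mean <- rowMeans(log_counts)
  # Size factor of a sample: median ratio of its counts to the reference.
  apply(log_counts, 2, function(col) exp(median(col - log_geo_mean)))
}

# Toy data: sample S2 was sequenced twice as deeply as S1.
counts <- matrix(
  c( 10,  20,
    100, 200,
     50, 100,
      0,   5),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("S1", "S2"))
)

sf <- size_factors_mor(counts)
normalized <- sweep(counts, 2, sf, "/")  # divide each column by its size factor
```

After dividing by the size factors, genes 1 to 3 have identical normalized values in both samples, which is the expected behavior when the only difference between the samples is sequencing depth.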

Differential expression analysis (DEA)

Now let’s dive into gene expression.

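library(DESeq2)
dds <- pre_process_results$DESeqData
# Set the control as the reference level, so fold changes read as treated vs. control
dds$Condition <- relevel(dds$Condition, ref = "Jurkat_ct")
# Estimate size factors and dispersions, then fit the negative binomial model
dds <- DESeq(dds)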

Let’s get a quick overview of the distribution of the DEGs

  • Volcano plot:

  • Heatmap:

Let’s draw a heatmap of the top 20 up-regulated and top 20 down-regulated genes
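
As an illustration of how genes could be labeled for the volcano plot and selected for the heatmap, here is a base R sketch on a made-up results table. The column names `log2FoldChange` and `padj` are those of a DESeq2 results table; the cutoffs (`padj < 0.05`, `|log2FoldChange| >= 1`), the ranking by fold change, and the helper `top_n_genes` are assumptions, since the report does not state them.

```r
# Hypothetical results table (values are made up).
res <- data.frame(
  gene = c("A", "B", "C", "D", "E", "F"),
  log2FoldChange = c(2.5, -3.1, 0.2, 1.4, -1.2, 4.0),
  padj = c(0.001, 0.0004, 0.8, 0.03, 0.2, NA),
  stringsAsFactors = FALSE
)

padj_cutoff <- 0.05  # assumption, not stated in this report
lfc_cutoff  <- 1     # assumption, not stated in this report

# Genes with an NA adjusted p-value are treated as not significant.
sig <- !is.na(res$padj) & res$padj < padj_cutoff
res$status <- "NS"
res$status[sig & res$log2FoldChange >=  lfc_cutoff] <- "Up"
res$status[sig & res$log2FoldChange <= -lfc_cutoff] <- "Down"

# Top n genes in one direction, ranked by absolute fold change.
top_n_genes <- function(res, direction, n = 20) {
  sub <- res[res$status == direction, ]
  ord <- order(abs(sub$log2FoldChange), decreasing = TRUE)
  head(sub$gene[ord], n)
}

heatmap_genes <- c(top_n_genes(res, "Up"), top_n_genes(res, "Down"))
```

The `status` column can be used to color the points of a volcano plot, and `heatmap_genes` gives the rows to extract from the transformed expression matrix.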



DEGs list overview

  • Up-regulated genes: 261


1- protein-coding genes

##   [1] "CXCR3"           "APLN"            "H2BC21"          "RGS6"           
##   [5] "BHLHE40"         "COL6A3"          "VGF"             "SLAMF8"         
##   [9] "KANK4"           "WNT8B"           "TMEM240"         "MSLNL"          
##  [13] "TMEM163"         "NANOS1"          "RPRML"           "CCR8"           
##  [17] "KIT"             "TMEM269"         "TERT"            "CLEC12A"        
##  [21] "RNASEH2A"        "DTL"             "UNG"             "MCM4"           
##  [25] "TCHH"            "LVRN"            "GPR146"          "ENSG00000289128"
##  [29] "CXXC4"           "REEP1"           "LRRC10B"         "DDIT4"          
##  [33] "H4C9"            "TINCR"           "DDT"             "SCLY"           
##  [37] "HERC3"           "PTPN6"           "SLC37A2"         "TRARG1"         
##  [41] "DENND2D"         "TMEM255B"        "PCNA"            "GET1"           
##  [45] "MYT1L"           "UHRF1"           "C1S"             "ELFN2"          
##  [49] "FAM81A"          "SASH3"           "JAKMIP1"         "TWIST1"         
##  [53] "ABTB3"           "ADGRL2"          "MYO18B"          "ENSG00000102409"
##  [57] "CSPG5"           "LNP1"            "GNA15"           "MCM2"           
##  [61] "P2RX6"           "FA2H"            "MCM3"            "OPN3"           
##  [65] "SLBP"            "NECAB3"          "EID2B"           "PDE4A"          
##  [69] "MCM7"            "IGLL1"           "NOTCH1"          "PPM1H"          
##  [73] "PDXP"            "JMJD7"           "DTX1"            "CALML4"         
##  [77] "FEN1"            "RMI2"            "KRT72"           "HK2"            
##  [81] "GNMT"            "TCF7"            "PIGW"            "ATP2B2"         
##  [85] "IMPDH2"          "ZNF704"          "H3C10"           "OXTR"           
##  [89] "PANO1"           "FBXL22"          "TTC7A"           "NT5DC2"         
##  [93] "POLG"            "NPIPB13"         "ANKRD37"         "NUP210"         
##  [97] "WT1"             "ZMYND19"         "RHPN2"           "CXXC5"          
## [101] "NCBP2AS2"        "NPIPB11"         "GNG4"            "NOTCH2NLB"      
## [105] "IVNS1ABP"

2- lncRNA genes

##   [1] "MIR210HG"        "ENSG00000288930" "RANBP3-DT"       "IRAIN"          
##   [5] "ENSG00000196465" "ENSG00000181577" "ENSG00000146223" "LINC00484"      
##   [9] "ENSG00000125651" "ENSG00000211454" "ENSG00000236204" "ENSG00000213139"
##  [13] "LINC01963"       "ENSG00000170571" "ENSG00000086300" "ENSG00000136840"
##  [17] "ENSG00000290124" "ENSG00000269968" "ENSG00000230148" "ILF3-DT"        
##  [21] "ENSG00000173914" "ENSG00000171357" "ENSG00000138614" "PKD1P6-NPIPP1"  
##  [25] "ENSG00000261226" "ENSG00000183979" "ENSG00000165028" "ENSG00000182197"
##  [29] "ASH1L-AS1"       "ENSG00000205903" "TOLLIP-DT"       "MIF4GD-DT"      
##  [33] "ENSG00000272356" "ZNF674-AS1"      "ENSG00000198034" "ENSG00000143971"
##  [37] "YTHDF3-DT"       "ENSG00000287362" "ENSG00000277767" "FAM30A"         
##  [41] "ENSG00000233178" "ENSG00000276390" "ENSG00000187164" "ENSG00000142556"
##  [45] "ENSG00000124370" "ENSG00000167863" "ENSG00000224066" "ENSG00000151093"
##  [49] "ENSG00000165475" "C16orf95-DT"     "ENSG00000289985" "ZNF346-IT1"     
##  [53] "ENSG00000259736" "ENSG00000288586" "ENSG00000278611" "ENSG00000140092"
##  [57] "ENSG00000231527" "FOXD2-AS1"       "LINC00641"       "LRRC8D-DT"      
##  [61] "NIPBL-DT"        "ENSG00000138738" "ENSG00000251247" "ENSG00000183891"
##  [65] "ENSG00000163935" "PRKCZ-AS1"       "SLFNL1-AS1"      "ENSG00000137941"
##  [69] "SLC16A4-AS1"     "ZNF710-AS1"      "MAD2L1-DT"       "ENSG00000260442"
##  [73] "ENSG00000244184" "GJD3-AS1"        "TBC1D22A-DT"     "ENSG00000157800"
##  [77] "ENSG00000126107" "MHENCR"          "ENSG00000197548" "ENSG00000133794"
##  [81] "SNHG10"          "C9orf163"        "ENSG00000175040" "ENSG00000143740"
##  [85] "ENSG00000105808" "RASSF1-AS1"      "ENSG00000178982" "ENSG00000185621"
##  [89] "PDK1-AS1"        "ENSG00000291056" "SUGT1-DT"        "ENSG00000124140"
##  [93] "LINC01311"       "ENSG00000285437" "ENSG00000100003" "NKILA"          
##  [97] "ENSG00000243479" "ENSG00000100321" "ENSG00000287104" "ENSG00000158555"
## [101] "ENSG00000153936" "MCPH1-DT"        "RNF139-DT"       "ENSG00000173218"
## [105] "ENSG00000155366" "ENSG00000267757" "ENSG00000055950" "ENSG00000162413"
## [109] "MCF2L-AS1"       "ENSG00000246308" "ENSG00000065413" "ENSG00000113595"
## [113] "LINC01284"       "ENSG00000139266" "ENSG00000153561" "ENSG00000118242"
## [117] "GABPB1-IT1"      "CIRBP-AS1"       "ENSG00000267688" "ENSG00000114019"
## [121] "LINC01971"       "ENSG00000068394" "ENSG00000185885" "PARTICL"        
## [125] "LINC00235"       "MIR193BHG"       "ST3GAL1-DT"      "ENSG00000196810"
## [129] "ENSG00000286548" "ENSG00000006194" "PAXIP1-DT"       "ENSG00000106336"
## [133] "LINC00528"       "CNN3-DT"         "ENSG00000198353" "ENSG00000108479"
## [137] "PRKCH-AS1"       "ENSG00000221823" "ENSG00000165915" "ENSG00000160867"
## [141] "LINC02019"       "YEATS2-AS1"      "STAM-DT"         "ENSG00000104219"
## [145] "LINC00304"       "RORB-AS1"        "ENSG00000149823" "LINC01422"      
## [149] "ENSG00000177556" "DNAH10OS"        "LINC01389"       "ENSG00000101336"
## [153] "ENSG00000276853" "ENSG00000213047" "LINC02918"       "DANCR"


  • Down-regulated genes: 650


1- protein-coding genes

##   [1] "CEP126"    "ZNF836"    "SLC26A11"  "CLEC18B"   "PCDHGA6"   "SLC8A2"   
##   [7] "PLCD1"     "RNFT1"     "RELB"      "ZBTB7B"    "ANKRD31"   "PCYT1B"   
##  [13] "MACO1"     "NQO2"      "NIPAL1"    "PLTP"      "ZNF585B"   "NPIPA5"   
##  [19] "STPG1"     "DMGDH"     "CFAP69"    "ZNF329"    "APOBEC3G"  "CLIC4"    
##  [25] "C2orf50"   "SERPINI1"  "B3GNT5"    "PLEKHA6"   "GDPD1"     "MAP1A"    
##  [31] "GABRB2"    "DUSP16"    "CCT6B"     "GARIN1A"   "ACTRT3"    "ELL2"     
##  [37] "BCHE"      "SHF"       "TCP11L2"   "ARMH4"     "TMEM79"    "TSPAN10"  
##  [43] "PEX5L"     "MYO5B"     "TBC1D30"   "FGF22"     "RGPD3"     "ANG"      
##  [49] "SPRY3"     "IFI35"     "DOK4"      "IFRD1"     "CASP9"     "CNGA1"    
##  [55] "SPTB"      "ZNF610"    "C3AR1"     "PCDHGA10"  "ERRFI1"    "FRRS1"    
##  [61] "TNS2"      "MPZL3"     "TSNAXIP1"  "EGFL7"     "SCG2"      "SLC16A9"  
##  [67] "DENND2C"   "RSPH4A"    "SLC17A7"   "FZD7"      "MCTP1"     "HOXB6"    
##  [73] "WNK4"      "RAB39B"    "NUDT13"    "LRP12"     "CYP2J2"    "WDR31"    
##  [79] "CTSG"      "ID1"       "SLCO3A1"   "SYNE4"     "CD84"      "SLC38A3"  
##  [85] "FAM219A"   "CYSRT1"    "SERTAD1"   "DNAAF8"    "LAT2"      "CA13"     
##  [91] "DNAJC3"    "TNN"       "ERBB2"     "CAPS"      "JAM2"      "ULBP2"    
##  [97] "RXRA"      "FLVCR2"    "PILRA"     "C11orf65"  "SPNS3"     "ICOS"     
## [103] "CBS"       "EVI2A"     "RTN2"      "ADAM8"     "IL20RB"    "BEND6"    
## [109] "FTH1"      "TMEM151A"  "TMEM267"   "GCNT4"     "PEX11G"    "C19orf38" 
## [115] "DNAI7"     "DDRGK1"    "CCDC184"   "P2RY10"    "TMBIM1"    "OASL"     
## [121] "FOSB"      "DBNDD1"    "CCDC69"    "NECTIN3"   "JAML"      "IRAK2"    
## [127] "EFNB2"     "TAGLN"     "BVES"      "SYNGR4"    "CACNG8"    "CCNB3"    
## [133] "FMO4"      "PEAR1"     "SLC43A1"   "MIA2"      "RAB17"     "CDS1"     
## [139] "EVA1B"     "TPST1"     "CYB5R1"    "C14orf39"  "NES"       "APOL1"    
## [145] "SSC4D"     "TXNRD1"    "C22orf23"  "METTL27"   "EHD2"      "PDGFRL"   
## [151] "MAML2"     "ZNF235"    "GAB2"      "CGRRF1"    "DAPP1"     "LCA5L"    
## [157] "DNAJB9"    "IRAG2"     "CCDC65"    "GPT2"      "SLC9A9"    "WIPI1"    
## [163] "EPHX1"     "MN1"       "SEL1L3"    "NUTM2E"    "RBP5"      "BBC3"     
## [169] "EPS8"      "NEK5"      "ASS1"      "ARHGAP29"  "RGS17"     "EIF4EBP1" 
## [175] "ABHD3"     "TXK"       "EID3"      "CTSO"      "PGGHG"     "ACTN2"    
## [181] "BMP10"     "KIF5C"     "CYP1A1"    "MFAP3L"    "TRIM22"    "LAMC3"    
## [187] "ZFP2"      "ZNF582"    "PCLO"      "PLEKHH3"   "EAF2"      "NAGS"     
## [193] "ATF3"      "HSPA5"     "HABP2"     "GCLM"      "HERPUD1"   "IL23A"    
## [199] "TIAM2"     "TM6SF1"    "DLL1"      "ADGRB1"    "SLC2A12"   "TLR3"     
## [205] "TLR5"      "DGKG"      "ASNS"      "NKX2-2"    "FAM43A"    "IL15RA"   
## [211] "PTGER3"    "CDKN1A"    "C7orf31"   "FAAH2"     "TFE3"      "ABHD4"    
## [217] "SLC48A1"   "IRF7"      "HRH1"      "AMIGO2"    "AARD"      "CILP2"    
## [223] "PCK2"      "SMPDL3B"   "DMRTA2"    "TNFSF9"    "UBXN8"     "PSAT1"    
## [229] "NPAS1"     "CECR2"     "EFNA1"     "ENTPD1"    "PLCL1"     "CD55"     
## [235] "PRG4"      "SV2B"      "CREB3L3"   "REPS2"     "ETFBKMT"   "ISG20"    
## [241] "PPFIBP2"   "INCA1"     "TMOD1"     "ARRDC3"    "SYT1"      "SLC1A5"   
## [247] "SESN2"     "YPEL4"     "ULBP1"     "HRK"       "PPP2R3A"   "FRMD3"    
## [253] "TCAF2"     "NOS1AP"    "DRC3"      "RBM11"     "SNAI1"     "POF1B"    
## [259] "SMTNL2"    "NCF2"      "ZNF880"    "NCAM1"     "TMEM200A"  "LY9"      
## [265] "IQUB"      "MT1F"      "DRAM1"     "SMOX"      "CLIP2"     "CEBPB"    
## [271] "TRPC5OS"   "AMOT"      "DAB1"      "RASD1"     "MET"       "SGIP1"    
## [277] "PHOSPHO1"  "GABARAPL1" "NECTIN2"   "IL7R"      "CCER2"     "FUT1"     
## [283] "WNT10A"    "MYH14"     "NIBAN1"    "TMEM217"   "PLAU"      "MYOM2"    
## [289] "DEPTOR"    "PMEL"      "FLRT1"     "SMAD7"     "ADGRL3"    "PRKG2"    
## [295] "GLS2"      "LDB3"      "PCDHB14"   "MPZ"       "TMEM156"   "TRAF3IP2" 
## [301] "TLL1"      "MFAP4"     "MAFB"      "SLC30A1"   "TEX14"     "KANK3"    
## [307] "LPIN3"     "THEMIS2"   "SSC5D"     "SHISA2"    "FMN1"      "SCN3A"    
## [313] "LACC1"     "LPAR4"     "NDUFA4L2"  "TMEM74B"   "XKRX"      "CDC20B"   
## [319] "HYDIN"     "PCDHGB7"   "TBL1X"     "ST8SIA6"   "SCN3B"     "RAD9B"    
## [325] "ADGRE1"    "STC2"      "TLR1"      "ITGB5"     "JDP2"      "NR2F2"    
## [331] "CRPPA"     "ERBB3"     "CDNF"      "ANXA3"     "GBP2"      "CALCRL"   
## [337] "TNFRSF11A" "ZNF703"    "CHRNB4"    "ADM2"      "HSPA12B"   "TRPC1"    
## [343] "LMX1B"     "PTGES"     "REXO5"     "CATSPERG"  "PCDHGB6"   "CREB5"    
## [349] "TEX19"     "TGFBR3"    "SLC45A1"   "COLGALT2"  "SYT11"     "SH2D6"    
## [355] "PCDH12"    "RPS6KA2"   "CST7"      "DUSP8"     "TREML2"    "WDR93"    
## [361] "PIFO"      "CDH15"     "TMIE"      "HSPA1B"    "CEACAM1"   "SLC16A6"  
## [367] "CCDC148"   "ZNF35"     "ZFPM2"     "CFTR"      "PCDH19"    "SP140"    
## [373] "GPAT3"     "PPP1R15A"  "FBXO39"    "CXCL3"     "TRAT1"     "SLC3A2"   
## [379] "SORBS1"    "MTTP"      "KIF17"     "ADTRP"     "ZBTB46"    "RGS16"    
## [385] "PIWIL4"    "FLRT3"     "JAKMIP2"   "TMEM140"   "SPTA1"     "CSTA"     
## [391] "TC2N"      "SCN4B"     "PLAC1"     "JAKMIP3"   "FOXA3"     "PRDM1"    
## [397] "HTR2B"     "F2RL2"     "NQO1"      "SOX6"      "PATJ"      "CHAC1"    
## [403] "GPR18"     "FAM166B"   "CRB3"      "MKX"       "TLR6"      "SCN4A"    
## [409] "CLCA1"     "LY96"      "IL31RA"    "NYAP1"     "DPP4"      "TRPM6"    
## [415] "CCPG1"     "AKR1C3"    "TRPS1"     "CEACAM21"  "IL12A"     "ID2"      
## [421] "OLAH"      "MAP3K8"    "GLIPR1"    "TRIB3"     "STYK1"     "MATN4"    
## [427] "BEST1"     "CCDC110"   "TSC22D3"   "KCNN3"     "FAM133A"   "CLU"      
## [433] "PDE11A"    "ADRA1A"    "JUN"       "CHRNA6"    "DDIT3"     "LRRC2"    
## [439] "HEPH"      "ILDR1"     "SATL1"     "GPR132"    "FOSL1"     "XKR3"     
## [445] "SLC6A9"    "INPP5J"    "GPX2"      "AKR1C8"    "CRIM1"     "UEVLD"    
## [451] "SLC7A11"   "TSPAN19"   "CYP19A1"   "PANX2"     "CD80"      "BMP6"     
## [457] "AKR1C2"    "OSGIN1"    "MMP8"      "EGF"       "HRG"       "GDAP1L1"  
## [463] "ABCC3"     "UCP1"      "HSPA6"     "RHOB"      "RASSF6"    "HMOX1"    
## [469] "C4orf17"

2- lncRNA genes

##   [1] "RAP2C-AS1"       "DDX19A-DT"       "SP2-DT"          "ENSG00000204482"
##   [5] "LINC01058"       "LINC00652"       "LINC01134"       "LINC01215"      
##   [9] "LINC-PINT"       "USP27X-DT"       "ENSG00000289958" "RPAP3-DT"       
##  [13] "ENSG00000138061" "ENSG00000272438" "ERCC6L2-AS1"     "PCAT1"          
##  [17] "ENSG00000102245" "SNRPA1-DT"       "ACTR3-AS1"       "ENSG00000261173"
##  [21] "LINC01115"       "CPEB1-AS1"       "ENSG00000177788" "PRDM8-AS1"      
##  [25] "MIR4280HG"       "LINC00662"       "ENSG00000291010" "TWF2-DT"        
##  [29] "RORA-AS1"        "ENSG00000178896" "COPB2-DT"        "ZNF225-AS1"     
##  [33] "ENSG00000197024" "LNCRNA-IUR"      "LINC01465"       "LINC00853"      
##  [37] "MIR4435-2HG"     "ENSG00000105698" "ENSG00000213588" "ENSG00000213465"
##  [41] "FAM174A-DT"      "ZNF451-AS1"      "LINC02709"       "HDAC2-AS2"      
##  [45] "ENSG00000260874" "ENSG00000125775" "PCOTH"           "BAZ2B-AS1"      
##  [49] "BTG1-DT"         "ENSG00000228436" "SDCBP2-AS1"      "ENSG00000140104"
##  [53] "ENSG00000108599" "ENSG00000275793" "TMEM254-AS1"     "ENSG00000285706"
##  [57] "ENSG00000288996" "HULC"            "ALG1L9P"         "ZFAND2A-DT"     
##  [61] "LINC01825"       "RPL37A-DT"       "SAMD12-AS1"      "ENSG00000162543"
##  [65] "LINC00877"       "LYPLAL1-DT"      "MKLN1-AS"        "BTG2-DT"        
##  [69] "ENSG00000188761" "ENSG00000283141" "ECE1-AS1"        "DLGAP1-AS2"     
##  [73] "WEE2-AS1"        "ENSG00000021574" "OXR1-AS1"        "LINC00680"      
##  [77] "ENSG00000289183" "ENSG00000273183" "SPOPL-DT"        "LINC02901"      
##  [81] "LGALS8-AS1"      "FBXO38-DT"       "LINC01694"       "ENSG00000102543"
##  [85] "LINC00659"       "ENSG00000204356" "ENSG00000198752" "MIR22HG"        
##  [89] "LINC00511"       "ENSG00000243811" "LINC00239"       "LINC00882"      
##  [93] "ENSG00000214770" "ENSG00000291136" "SLC12A5-AS1"     "ENSG00000120071"
##  [97] "MAGOH-DT"        "MIAT"            "YWHAH-AS1"       "KRT10-AS1"      
## [101] "ENO1-AS1"        "LINC00685"       "LINC01277"       "ENSG00000291027"
## [105] "KDSR-DT"         "ENSG00000006634" "ENSG00000160703" "TTTY14"         
## [109] "LINC02265"       "ENSG00000226029" "CEBPB-AS1"       "ENSG00000159496"
## [113] "FKTN-AS1"        "LINC01307"       "ZNF516-AS1"      "ENSG00000139800"
## [117] "LACTB2-AS1"      "ENSG00000289115" "LINC02377"       "ZBED3-AS1"      
## [121] "ENSG00000214046" "ENSG00000115355" "ENSG00000161016" "LINC00029"      
## [125] "ENSG00000119684" "RPL26L1-AS1"     "PRRT3-AS1"       "CBR3-AS1"       
## [129] "JDP2-AS1"        "ENSG00000172071" "ENSG00000110063" "PRMT5-DT"       
## [133] "ENSG00000124172" "ENSG00000205643" "ENSG00000287808" "ENSG00000163794"
## [137] "GAS1RR"          "ENSG00000149451" "ENSG00000284308" "ENSG00000135436"
## [141] "VLDLR-AS1"       "ENSG00000227500" "ENSG00000180611" "LINC00992"      
## [145] "UBQLN1-AS1"      "ENSG00000157992" "LINC02341"       "LINC02018"      
## [149] "NQO1-DT"         "LINC00648"       "ENSG00000158406" "ENSG00000198888"
## [153] "ECI2-DT"         "ERICH2-DT"       "ENSG00000105516" "MDN1-AS1"       
## [157] "ENSG00000254510" "LINC02321"       "HIPK1-AS1"       "ENSG00000196132"
## [161] "ENSG00000188739" "PRPF19-DT"       "ENSG00000171735" "PCDH10-DT"      
## [165] "ENSG00000219200" "ENSG00000132005" "LINC00632"       "ROCK1P1"        
## [169] "SLC26A4-AS1"     "ENSG00000263528" "LUCAT1"          "ENSG00000272010"
## [173] "ENSG00000259905" "ENSG00000163154" "LINC02561"       "STX5-DT"        
## [177] "LINC02539"       "L3MBTL2-AS1"     "ENSG00000290683" "LINC00365"      
## [181] "NMRAL2P"


Now let’s inspect the gene biotypes



Let’s take a closer look at the top up-regulated and down-regulated genes!

Pathway enrichment analysis (PEA)

Gene Ontology (GO)

NOTE:

For the downstream analysis, only protein_coding genes will be kept so that the enriched terms are meaningful (lncRNAs are sparsely annotated in GO).

We will first use all the protein_coding DEGs (574), then only the down-regulated ones (469).


All DEGs

Let’s inspect the pathways enriched with all the DEGs (up & down) that are protein_coding

Results


The top 40 most significant pathways enriched with these DEGs are:

##  [1] "leukocyte activation"                                     
##  [2] "lymphocyte activation"                                    
##  [3] "multicellular organismal-level homeostasis"               
##  [4] "regulation of cell activation"                            
##  [5] "regulation of leukocyte activation"                       
##  [6] "cell-cell adhesion"                                       
##  [7] "regulation of leukocyte differentiation"                  
##  [8] "T cell activation"                                        
##  [9] "myeloid leukocyte activation"                             
## [10] "blood vessel morphogenesis"                               
## [11] "regulation of hemopoiesis"                                
## [12] "response to endoplasmic reticulum stress"                 
## [13] "vasculature development"                                  
## [14] "response to lipid"                                        
## [15] "negative regulation of immune system process"             
## [16] "regulation of T cell mediated immunity"                   
## [17] "response to ketone"                                       
## [18] "leukocyte differentiation"                                
## [19] "integrated stress response signaling"                     
## [20] "epithelial cell proliferation"                            
## [21] "blood vessel development"                                 
## [22] "circulatory system process"                               
## [23] "response to organic cyclic compound"                      
## [24] "mononuclear cell differentiation"                         
## [25] "lymphocyte differentiation"                               
## [26] "endothelial cell migration"                               
## [27] "response to topologically incorrect protein"              
## [28] "angiogenesis"                                             
## [29] "regulation of lymphocyte differentiation"                 
## [30] "secretion"                                                
## [31] "positive regulation of cytokine production"               
## [32] "response to unfolded protein"                             
## [33] "inflammatory response"                                    
## [34] "regulation of lymphocyte activation"                      
## [35] "secondary metabolic process"                              
## [36] "regulation of cell adhesion"                              
## [37] "tube morphogenesis"                                       
## [38] "cell-cell adhesion via plasma-membrane adhesion molecules"
## [39] "hemopoiesis"                                              
## [40] "immune effector process"
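
For context, the significance of a term in this kind of analysis is usually obtained from an over-representation (hypergeometric) test followed by multiple-testing correction. The sketch below assumes that this is the test used here, which the report does not state; apart from the 574 submitted DEGs, all numbers are hypothetical.

```r
# Hypothetical numbers for a single GO term.
N <- 15000  # background genes (assumed universe size)
K <- 300    # background genes annotated to the term
n <- 574    # protein-coding DEGs submitted to the analysis
k <- 30     # DEGs annotated to the term

# P(X >= k) under the hypergeometric null: n genes drawn at random from N,
# of which K are annotated to the term.
p_value <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# Many terms are tested, so p-values are adjusted (here Benjamini-Hochberg).
# term2 and term3 are made-up p-values for two other terms.
p_values <- c(term1 = p_value, term2 = 0.04, term3 = 0.6)
p_adjusted <- p.adjust(p_values, method = "BH")
```

With these numbers about 11 annotated DEGs would be expected by chance (574 × 300 / 15000), so observing 30 yields a very small p-value. Terms are then ranked by their adjusted p-values.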


The top 40 pathways with the highest RichFactor (ratio of the number of DEGs in a pathway to the pathway size):

##  [1] "double-strand break repair via break-induced replication"                    
##  [2] "regulation of ferroptosis"                                                   
##  [3] "regulation of heat generation"                                               
##  [4] "detection of molecule of bacterial origin"                                   
##  [5] "ferroptosis"                                                                 
##  [6] "DNA strand elongation involved in DNA replication"                           
##  [7] "epithelial cell fate commitment"                                             
##  [8] "regulation of IRE1-mediated unfolded protein response"                       
##  [9] "regulation of DNA-templated DNA replication initiation"                      
## [10] "negative regulation of endoplasmic reticulum unfolded protein response"      
## [11] "heat generation"                                                             
## [12] "negative regulation of leukocyte chemotaxis"                                 
## [13] "melanin biosynthetic process"                                                
## [14] "melanin metabolic process"                                                   
## [15] "secondary metabolite biosynthetic process"                                   
## [16] "integrated stress response signaling"                                        
## [17] "regulation of T cell mediated cytotoxicity"                                  
## [18] "secondary metabolic process"                                                 
## [19] "leukocyte activation involved in inflammatory response"                      
## [20] "negative regulation of chemotaxis"                                           
## [21] "negative regulation of leukocyte migration"                                  
## [22] "T cell mediated cytotoxicity"                                                
## [23] "regulation of T cell mediated immunity"                                      
## [24] "toll-like receptor signaling pathway"                                        
## [25] "endoplasmic reticulum unfolded protein response"                             
## [26] "female gonad development"                                                    
## [27] "development of primary female sexual characteristics"                        
## [28] "response to unfolded protein"                                                
## [29] "macrophage activation"                                                       
## [30] "positive regulation of tumor necrosis factor superfamily cytokine production"
## [31] "response to topologically incorrect protein"                                 
## [32] "T cell mediated immunity"                                                    
## [33] "negative regulation of immune effector process"                              
## [34] "response to glucocorticoid"                                                  
## [35] "myeloid leukocyte activation"                                                
## [36] "response to ketone"                                                          
## [37] "homophilic cell adhesion via plasma membrane adhesion molecules"             
## [38] "response to corticosteroid"                                                  
## [39] "negative regulation of leukocyte activation"                                 
## [40] "regulation of lymphocyte differentiation"
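
The RichFactor ranking above can differ a lot from the significance ranking, because small pathways reach a high ratio with only a few DEGs. A minimal base R sketch on a made-up enrichment table shows this; the column names `Count`, `setSize` and `p.adjust` are illustrative and may not match the actual result object.

```r
# Hypothetical enrichment table: Count = DEGs in the pathway, setSize = pathway size.
enrich <- data.frame(
  Description = c("broad term", "medium term", "small term"),
  Count    = c(60, 12, 5),
  setSize  = c(900, 100, 20),
  p.adjust = c(1e-8, 1e-4, 1e-3),
  stringsAsFactors = FALSE
)

# RichFactor: fraction of the pathway's genes that are DEGs.
enrich$RichFactor <- enrich$Count / enrich$setSize

by_significance <- enrich$Description[order(enrich$p.adjust)]
by_richfactor   <- enrich$Description[order(enrich$RichFactor, decreasing = TRUE)]
```

Here the broad term is the most significant but has the lowest RichFactor, while the small term ranks first by RichFactor. This is consistent with the two lists above, where broad terms such as "leukocyte activation" lead by significance and specific terms such as "regulation of ferroptosis" lead by RichFactor.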



Dotplot



Cnet plot




Down DEGs

Results


The top 50 most significant pathways enriched with these DEGs are:

##  [1] "leukocyte activation"                                           
##  [2] "response to endoplasmic reticulum stress"                       
##  [3] "regulation of cell activation"                                  
##  [4] "cell-cell adhesion"                                             
##  [5] "multicellular organismal-level homeostasis"                     
##  [6] "lymphocyte activation"                                          
##  [7] "regulation of leukocyte activation"                             
##  [8] "integrated stress response signaling"                           
##  [9] "myeloid leukocyte activation"                                   
## [10] "response to lipid"                                              
## [11] "response to topologically incorrect protein"                    
## [12] "response to unfolded protein"                                   
## [13] "circulatory system process"                                     
## [14] "T cell activation"                                              
## [15] "endothelial cell migration"                                     
## [16] "regulation of T cell mediated immunity"                         
## [17] "cell-cell adhesion via plasma-membrane adhesion molecules"      
## [18] "regulation of leukocyte differentiation"                        
## [19] "positive regulation of response to external stimulus"           
## [20] "toll-like receptor signaling pathway"                           
## [21] "response to nutrient levels"                                    
## [22] "negative regulation of immune system process"                   
## [23] "homophilic cell adhesion via plasma membrane adhesion molecules"
## [24] "regulation of ferroptosis"                                      
## [25] "endoplasmic reticulum unfolded protein response"                
## [26] "regulation of T cell mediated cytotoxicity"                     
## [27] "leukocyte activation involved in inflammatory response"         
## [28] "response to ketone"                                             
## [29] "regulation of response to biotic stimulus"                      
## [30] "regulation of heat generation"                                  
## [31] "detection of molecule of bacterial origin"                      
## [32] "ferroptosis"                                                    
## [33] "leukocyte differentiation"                                      
## [34] "inflammatory response"                                          
## [35] "positive regulation of cytokine production"                     
## [36] "macrophage activation"                                          
## [37] "viral entry into host cell"                                     
## [38] "cell junction organization"                                     
## [39] "T cell mediated immunity"                                       
## [40] "regulation of hemopoiesis"                                      
## [41] "response to molecule of bacterial origin"                       
## [42] "regulation of IRE1-mediated unfolded protein response"          
## [43] "entry into host"                                                
## [44] "positive regulation of defense response"                        
## [45] "regulation of cell adhesion"                                    
## [46] "cellular response to unfolded protein"                          
## [47] "blood circulation"                                              
## [48] "blood vessel morphogenesis"                                     
## [49] "negative regulation of cell activation"                         
## [50] "T cell mediated cytotoxicity"


The top 50 pathways with the highest RichFactor (ratio of the number of DEGs in a pathway to the pathway size):

##  [1] "regulation of ferroptosis"                                                        
##  [2] "regulation of heat generation"                                                    
##  [3] "detection of molecule of bacterial origin"                                        
##  [4] "ferroptosis"                                                                      
##  [5] "natural killer cell mediated cytotoxicity directed against tumor cell target"     
##  [6] "regulation of natural killer cell mediated immune response to tumor cell"         
##  [7] "anterograde dendritic transport"                                                  
##  [8] "regulation of IRE1-mediated unfolded protein response"                            
##  [9] "negative regulation of endoplasmic reticulum unfolded protein response"           
## [10] "heat generation"                                                                  
## [11] "positive regulation of chondrocyte differentiation"                               
## [12] "PERK-mediated unfolded protein response"                                          
## [13] "IRE1-mediated unfolded protein response"                                          
## [14] "regulation of translation in response to stress"                                  
## [15] "regulation of cardiac muscle cell differentiation"                                
## [16] "integrated stress response signaling"                                             
## [17] "negative regulation of T cell mediated immunity"                                  
## [18] "positive regulation of cartilage development"                                     
## [19] "regulation of endoplasmic reticulum unfolded protein response"                    
## [20] "regulation of T cell mediated cytotoxicity"                                       
## [21] "leukocyte activation involved in inflammatory response"                           
## [22] "negative regulation of response to endoplasmic reticulum stress"                  
## [23] "regulation of acute inflammatory response"                                        
## [24] "microglial cell activation"                                                       
## [25] "amino acid import across plasma membrane"                                         
## [26] "T cell mediated cytotoxicity"                                                     
## [27] "toll-like receptor signaling pathway"                                             
## [28] "intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress"
## [29] "endoplasmic reticulum unfolded protein response"                                  
## [30] "regulation of T cell mediated immunity"                                           
## [31] "response to unfolded protein"                                                     
## [32] "cellular response to unfolded protein"                                            
## [33] "regulation of response to endoplasmic reticulum stress"                           
## [34] "macrophage activation"                                                            
## [35] "response to topologically incorrect protein"                                      
## [36] "acute inflammatory response"                                                      
## [37] "T cell mediated immunity"                                                         
## [38] "cellular response to topologically incorrect protein"                             
## [39] "chloride transport"                                                               
## [40] "homophilic cell adhesion via plasma membrane adhesion molecules"                  
## [41] "regulation of myeloid leukocyte differentiation"                                  
## [42] "response to endoplasmic reticulum stress"                                         
## [43] "viral entry into host cell"                                                       
## [44] "entry into host"                                                                  
## [45] "myeloid leukocyte activation"                                                     
## [46] "positive regulation of inflammatory response"                                     
## [47] "monoatomic anion transport"                                                       
## [48] "response to ketone"                                                               
## [49] "negative regulation of leukocyte activation"                                      
## [50] "endothelial cell migration"



Dotplot



Cnet plot


Molecular Signatures Database (MSigDB)

MSigDB is a collection of annotated gene sets. The human database is organized into 9 major collections:

  • H : hallmark gene sets
  • C1 : positional gene sets
  • C2 : curated gene sets
  • C3 : regulatory target gene sets
  • C4 : computational gene sets
  • C5 : GO gene sets
  • C6 : oncogenic signatures
  • C7 : immunologic signatures
  • C8 : cell type signature gene sets
##                                       gs_collection
## gs_collection_name                         C1     C2     C3     C4     C5
##   BioCarta Pathways                         0  11088      0      0      0
##   Cancer Gene Neighborhoods                 0      0      0  42623      0
##   Cancer Modules                            0      0      0  48830      0
##   Canonical Pathways                        0    579      0      0      0
##   Cell Type Signature                       0      0      0      0      0
##   Chemical and Genetic Perturbations        0 442721      0      0      0
##   Curated Cancer Cell Atlas gene sets       0      0      0   7390      0
##   GO Biological Process                     0      0      0      0 617648
##   GO Cellular Component                     0      0      0      0 103371
##   GO Molecular Function                     0      0      0      0 113119
##   GTRD                                      0      0 257418      0      0
##   Hallmark                                  0      0      0      0      0
##   HIPC Vaccine Response                     0      0      0      0      0
##   Human Phenotype Ontology                  0      0      0      0 530149
##   ImmuneSigDB                               0      0      0      0      0
##   KEGG Legacy Pathways                      0  12904      0      0      0
##   KEGG Medicus Pathways                     0   9688      0      0      0
##   MIR_Legacy                                0      0  34266      0      0
##   miRDB                                     0      0 731405      0      0
##   Oncogenic Signature                       0      0      0      0      0
##   PID Pathways                              0   8062      0      0      0
##   Positional                            43883      0      0      0      0
##   Reactome Pathways                         0 108275      0      0      0
##   TFT_Legacy                                0      0 155547      0      0
##   WikiPathways                              0  40070      0      0      0
##                                       gs_collection
## gs_collection_name                         C6     C7     C8      H
##   BioCarta Pathways                         0      0      0      0
##   Cancer Gene Neighborhoods                 0      0      0      0
##   Cancer Modules                            0      0      0      0
##   Canonical Pathways                        0      0      0      0
##   Cell Type Signature                       0      0 157573      0
##   Chemical and Genetic Perturbations        0      0      0      0
##   Curated Cancer Cell Atlas gene sets       0      0      0      0
##   GO Biological Process                     0      0      0      0
##   GO Cellular Component                     0      0      0      0
##   GO Molecular Function                     0      0      0      0
##   GTRD                                      0      0      0      0
##   Hallmark                                  0      0      0   7333
##   HIPC Vaccine Response                     0  44668      0      0
##   Human Phenotype Ontology                  0      0      0      0
##   ImmuneSigDB                               0 948621      0      0
##   KEGG Legacy Pathways                      0      0      0      0
##   KEGG Medicus Pathways                     0      0      0      0
##   MIR_Legacy                                0      0      0      0
##   miRDB                                     0      0      0      0
##   Oncogenic Signature                   30753      0      0      0
##   PID Pathways                              0      0      0      0
##   Positional                                0      0      0      0
##   Reactome Pathways                         0      0      0      0
##   TFT_Legacy                                0      0      0      0
##   WikiPathways                              0      0      0      0
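The cross-tabulation above has the shape of a `table()` call on two columns of a long gene-set table, in which each row is presumably one gene-to-gene-set pair. The following is a minimal sketch on a toy data frame: the column names `gs_collection` and `gs_collection_name` are taken from the output above, while `gs_name`, `gene_symbol` and all values are invented for illustration. It shows that the cells of such a table count rows (gene memberships), not distinct gene sets.

```r
# Toy long-format gene-set table: one row per (gene set, gene) pair.
# All gene sets and gene symbols here are invented for illustration.
toy_sets <- data.frame(
  gs_collection      = c("H", "H", "H", "C6", "C6"),
  gs_collection_name = c("Hallmark", "Hallmark", "Hallmark",
                         "Oncogenic Signature", "Oncogenic Signature"),
  gs_name            = c("SET_A", "SET_A", "SET_B", "SET_C", "SET_C"),
  gene_symbol        = c("G1", "G2", "G1", "G3", "G4")
)

# Same layout as the output above: names in rows, collection codes in columns.
membership_counts <- table(toy_sets$gs_collection_name, toy_sets$gs_collection)
print(membership_counts)

# Number of distinct gene sets per collection, for comparison.
sets_per_collection <- tapply(toy_sets$gs_name, toy_sets$gs_collection,
                              function(x) length(unique(x)))
print(sets_per_collection)
```

Read this way, the large numbers in the table (for example in ImmuneSigDB) reflect both the number of gene sets and their sizes.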



Let’s look at the number of significantly enriched terms in each collection, using the down-regulated genes as input:

## C3 : MIR_Legacy                              : 0 significant terms
## C3 : TFT_Legacy                              : 8 significant terms
## C2 : Chemical and Genetic Perturbations      : 338 significant terms
## C3 : GTRD                                    : 0 significant terms
## C8 : Cell Type Signature                     : 104 significant terms
## C6 : Oncogenic Signature                     : 19 significant terms
## C7 : HIPC Vaccine Response                   : 27 significant terms
## C2 : BioCarta Pathways                       : 0 significant terms
## C4 : Cancer Gene Neighborhoods               : 0 significant terms
## C4 : Curated Cancer Cell Atlas gene sets     : 7 significant terms
## C5 : GO Biological Process                   : 28 significant terms
## C5 : GO Cellular Component                   : 4 significant terms
## C7 : ImmuneSigDB                             : 308 significant terms
## C5 : GO Molecular Function                   : 1 significant terms
## H : Hallmark                                 : 7 significant terms
## C5 : Human Phenotype Ontology                : 0 significant terms
## C2 : KEGG Legacy Pathways                    : 0 significant terms
## C2 : KEGG Medicus Pathways                   : 1 significant terms
## C3 : miRDB                                   : 0 significant terms
## C4 : Cancer Modules                          : 1 significant terms
## C1 : Positional                              : 0 significant terms
## C2 : PID Pathways                            : 0 significant terms
## C2 : Reactome Pathways                       : 9 significant terms
## C2 : Canonical Pathways                      : 0 significant terms
## C2 : WikiPathways                            : 9 significant terms
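Each significant term counted above is a gene set that passed an over-representation test. The exact call and cutoff used for this report are not shown here, so the following is only a sketch of the one-sided hypergeometric test that such analyses are usually based on. All four counts are hypothetical.

```r
# One-sided hypergeometric (over-representation) test for a single gene set.
# All four counts are hypothetical and chosen only for illustration.
universe_size <- 15000  # genes tested in the differential expression analysis
set_size      <- 200    # genes of the gene set present in the universe
n_down        <- 500    # down-regulated genes submitted to the test
overlap       <- 20     # down-regulated genes that belong to the gene set

# P(X >= overlap) when drawing n_down genes without replacement.
p_value <- phyper(overlap - 1, set_size, universe_size - set_size, n_down,
                  lower.tail = FALSE)

# Expected overlap under the null, for comparison with the observed overlap.
expected_overlap <- n_down * set_size / universe_size

# With many gene sets, p-values are corrected for multiple testing,
# for example with the Benjamini-Hochberg procedure.
p_values   <- c(p_value, 0.04, 0.30, 0.75)
p_adjusted <- p.adjust(p_values, method = "BH")
```

A gene set is then reported as significant when its adjusted p-value falls below the chosen cutoff.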



- C6 : Oncogenic signatures



- C7 : Immunologic signatures



- C5 : GO Biological Process



- C8 : Cell Type Signature



- H : Hallmark

Gene Set Enrichment Analysis (GSEA)

GSEA is a functional class scoring method that evaluates whether predefined gene sets show statistically significant, concordant differences between two biological conditions. Unlike over-representation approaches, GSEA uses the entire ranked list of genes, avoiding the need for an arbitrary significance cutoff.
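To make the idea of a running score over the whole ranked list concrete, here is a simplified sketch of the weighted running-sum statistic behind GSEA. It is not the implementation used for the results below: it omits the permutation-based p-values and the normalization of the score, and the function name `enrichment_score`, the gene names and the scores are all invented for illustration.

```r
# Simplified GSEA enrichment score (weighted running sum), for illustration.
# Assumes at least one gene of the set has a non-zero score.
enrichment_score <- function(stats, gene_set, weight = 1) {
  stats  <- sort(stats, decreasing = TRUE)   # rank genes from up to down
  in_set <- names(stats) %in% gene_set
  # Steps up at gene-set members, proportional to |score|^weight.
  step_up   <- ifelse(in_set, abs(stats)^weight, 0)
  step_up   <- step_up / sum(step_up)
  # Steps down at all other genes, all of equal size.
  step_down <- ifelse(in_set, 0, 1 / sum(!in_set))
  running   <- cumsum(step_up - step_down)
  # The enrichment score is the largest deviation from zero.
  running[[which.max(abs(running))]]
}

# Hypothetical ranked list (names and scores invented).
toy_ranks <- c(g1 = 2.5, g2 = 1.8, g3 = 0.9, g4 = 0, g5 = -0.7,
               g6 = -1.6, g7 = -2.2)

es_top    <- enrichment_score(toy_ranks, c("g1", "g2"))  # set at the top
es_bottom <- enrichment_score(toy_ranks, c("g6", "g7"))  # set at the bottom
```

A gene set concentrated at the top of the list gives a positive score and one concentrated at the bottom gives a negative score, which is how GSEA distinguishes activated from repressed gene sets.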

In this analysis, genes were ranked using a composite score based on both:

  • the magnitude and direction of differential expression (log2FoldChange)
  • the statistical significance (adjusted p-value, padj)


Interpretation of the ranking score

  • Large positive scores correspond to the most significant up-regulated genes
  • Large negative scores correspond to the most significant down-regulated genes
  • Scores close to zero correspond to genes with little change or weak statistical support


The ranking score was computed according to the following rule:

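# Composite ranking score; case_when() comes from dplyr
ranking <- case_when(
  # Large effects: cap the fold change contribution at +/- 3
  abs(log2FoldChange) >= 3 & padj < 0.5 ~
    round((sign(log2FoldChange) * 3) * (-log10(padj) / 10), 4),

  # Smaller effects: use the fold change as is
  abs(log2FoldChange) < 3 & padj < 0.5 ~
    round(log2FoldChange * (-log10(padj) / 10), 4),

  # Weak statistical support: neutral score
  padj >= 0.5 ~ 0
)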

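In other words:

\[\text{Score} = \mathrm{sign}(\log_2 FC) \times \min(|\log_2 FC|,\, 3) \times \frac{-\log_{10}(\text{padj})}{10}\]

The contribution of the fold change is capped at an absolute log2FoldChange of 3. Beyond that value, the ranking metric is driven only by the statistical significance (padj).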

Genes with padj ≥ 0.5 are assigned a score of zero, effectively positioning them in the middle of the ranked list.

This scoring strategy combines effect size and statistical confidence in a single metric, which makes it suitable for downstream GSEA.
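As a worked example, the rule can be applied to a small table in base R. This is a sketch only: the function name `ranking_score`, the toy genes and their values are invented, and the handling of missing `padj` values (set to zero here) is an assumption that is not part of the rule above.

```r
# Toy differential expression results (values invented for illustration).
toy_res <- data.frame(
  gene           = c("A", "B", "C", "D", "E"),
  log2FoldChange = c(5, 1.5, -4, -0.8, 2),
  padj           = c(1e-20, 1e-4, 1e-10, 0.9, NA)
)

ranking_score <- function(log2FoldChange, padj) {
  # Cap the fold change contribution at +/- 3, as in the rule above.
  capped_lfc <- sign(log2FoldChange) * pmin(abs(log2FoldChange), 3)
  score <- round(capped_lfc * (-log10(padj) / 10), 4)
  # Genes with padj >= 0.5 get a score of zero.
  score[!is.na(padj) & padj >= 0.5] <- 0
  # Assumption (not in the rule above): a missing padj also gives zero.
  score[is.na(padj)] <- 0
  score
}

toy_res$ranking <- ranking_score(toy_res$log2FoldChange, toy_res$padj)

# Named vector sorted in decreasing order, the input format expected by
# GSEA tools such as clusterProfiler::GSEA().
gene_list <- sort(setNames(toy_res$ranking, toy_res$gene), decreasing = TRUE)
```

Gene A, with a log2FoldChange of 5, contributes the same fold change term as a gene at exactly 3, so its position in the list depends only on its adjusted p-value.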


For this analysis we are going to use gene sets from the following MSigDB collections (introduced above):

  • H : hallmark gene sets
  • C5 : GO gene sets
  • C6 : oncogenic signatures

Hallmarks

Below are the results, represented as a dot plot



Now let’s view the enrichment plots


C5: GO gene sets

Below are the results, represented as a dot plot



Now let’s view the enrichment plots


C6: oncogenic signatures

Below are the results, represented as a dot plot



Now let’s view the enrichment plots